Combining the Sparsity and Unambiguity Biases for Grammar Induction
Author
Abstract
In this paper we describe our participating system for the dependency induction track of the PASCAL Challenge on Grammar Induction. Our system incorporates two types of inductive bias: a sparsity bias and an unambiguity bias. The sparsity bias favors a grammar with fewer grammar rules. The unambiguity bias favors a grammar that leads to unambiguous parses, motivated by the observation that natural language is remarkably unambiguous: the number of plausible parses of a sentence is very small. We introduce our approach to combining these two types of biases and discuss the system implementation. Our experiments show that both types of inductive bias are beneficial to grammar induction.
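As a rough illustration of how the two biases could enter a single training objective, the sketch below combines a soft count of active grammar rules (the sparsity term) with the entropy of each sentence's posterior over parses (the unambiguity term). The function name `regularized_objective`, the data layout, and both penalty forms are assumptions made for this sketch, not the system's actual formulation.

```python
import numpy as np

def regularized_objective(log_likelihood, rule_expected_counts, parse_posteriors,
                          lam_sparsity=1.0, lam_unambiguity=1.0, tau=1e-2):
    """Toy objective: data log-likelihood minus a sparsity penalty and an
    unambiguity penalty (all names and penalty forms are illustrative)."""
    # Sparsity bias: a soft count of the rules that are actually used.
    # c / (c + tau) is near 1 for well-used rules and near 0 for unused ones,
    # so a grammar with fewer active rules incurs a smaller penalty.
    sparsity_penalty = sum(c / (c + tau) for c in rule_expected_counts.values())

    # Unambiguity bias: the entropy of each sentence's posterior over parses.
    # A posterior concentrated on one or two parses has low entropy, so a
    # grammar that parses sentences unambiguously incurs a smaller penalty.
    unambiguity_penalty = sum(-np.sum(q * np.log(q + 1e-12))
                              for q in parse_posteriors)

    return (log_likelihood
            - lam_sparsity * sparsity_penalty
            - lam_unambiguity * unambiguity_penalty)

# Example: two rules, two sentences with (already normalized) parse posteriors.
counts = {("ROOT", "VERB"): 3.2, ("VERB", "NOUN"): 0.001}
posteriors = [np.array([0.9, 0.1]), np.array([0.5, 0.3, 0.2])]
print(regularized_objective(-42.0, counts, posteriors))
```

In a full system, an objective of this shape would be maximized with an EM-style procedure that re-estimates rule probabilities in the M-step; the sketch only shows where the two biases enter.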
Similar Papers
Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
We introduce a novel approach named unambiguity regularization for unsupervised learning of probabilistic natural language grammars. The approach is based on the observation that natural language is remarkably unambiguous in the sense that only a tiny portion of the large number of possible parses of a natural language sentence are syntactically valid. We incorporate an inductive bias into gram...
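One way such a bias is often realized inside EM, shown here purely as an illustrative sketch (the abstract above is truncated and does not spell out the update; `sharpen_posterior` and `sigma` are names introduced for illustration), is to sharpen each sentence's posterior over candidate parses in the E-step so that probability mass concentrates on a few parses.

```python
import numpy as np

def sharpen_posterior(parse_probs, sigma=0.5):
    """Illustrative annealed E-step: raise the posterior over a sentence's
    candidate parses to the power 1 / (1 - sigma) and renormalize.

    sigma = 0 leaves the posterior unchanged; as sigma approaches 1 the
    update approaches a hard, Viterbi-style choice of the single best parse.
    """
    p = np.asarray(parse_probs, dtype=float)
    q = p ** (1.0 / (1.0 - sigma))
    return q / q.sum()

# Example: a spread-out posterior over four candidate parses becomes
# noticeably more peaked after sharpening.
print(sharpen_posterior([0.4, 0.3, 0.2, 0.1], sigma=0.5))
```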
Sparsity in Dependency Grammar Induction
A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments w...
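The sketch below gives a rough Python illustration of a sparsity-inducing penalty of this general flavor: posterior edge probabilities are grouped by parent-child POS tag pair, the maximum within each group is taken, and the maxima are summed, an L1/L-infinity-style penalty. The data layout and the name `l1_linf_penalty` are assumptions for illustration rather than the exact penalty used in the cited work.

```python
from collections import defaultdict

def l1_linf_penalty(edge_posteriors):
    """Sum, over parent-child POS tag pairs, of the maximum posterior
    probability assigned to any dependency edge of that type.

    edge_posteriors : iterable of ((parent_tag, child_tag), prob) pairs,
        e.g. gathered from per-sentence edge expectations over a corpus.
    """
    group_max = defaultdict(float)
    for tag_pair, prob in edge_posteriors:
        group_max[tag_pair] = max(group_max[tag_pair], prob)
    return sum(group_max.values())

# Example: two distinct dependency types, three observed edges.
edges = [(("VERB", "NOUN"), 0.9), (("VERB", "NOUN"), 0.7), (("NOUN", "ADJ"), 0.2)]
print(l1_linf_penalty(edges))  # 0.9 + 0.2 = 1.1
```

The penalty grows with the number of distinct dependency types that receive strong posterior support, so minimizing it alongside the likelihood encourages a small set of unique parent-child tag pairs.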
متن کاملیک مدل بیزی برای استخراج باناظر گرامر زبان طبیعی
In this paper, we show that the problem of grammar induction can be modeled as a combination of several model selection problems. We use the infinite generalization of a Bayesian model of cognition to solve each model selection problem in our grammar induction model. This Bayesian model is capable of solving model selection problems in a manner consistent with human cognition. We also show that using th...
Syntactic islands and learning biases: Combining experimental syntax and computational modeling to investigate the language acquisition problem
The induction problems facing language learners have played a central role in debates about the types of learning biases that exist in the human brain. Many linguists have argued that some of the learning biases necessary to solve these language induction problems must be both innate and language-specific (i.e., the Universal Grammar (UG) hypothesis). Though there have been several r...
Deterministic Cooperating Distributed Grammar Systems
Subclasses of grammar systems that can facilitate parser construction appear to be of interest. In this paper, some syntactical conditions considered for strict deterministic grammars are extended to cooperating distributed grammar systems, restricted to the terminal derivation mode. Two variants are considered, according to the level at which the conditions apply. The local variant, which int...